Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

All deprecations on AtomGroup and friends getters and setters. #698

Merged
merged 19 commits into from
Apr 10, 2016

Conversation

dotsdl
Copy link
Member

@dotsdl dotsdl commented Feb 6, 2016

The new topology system for #363 will arrive in 0.16.0, and this removes
redundant getters and setters of various Atom, Residue, and
Segment properties. These methods predated the properties that replace
them, and this commit features deprecation warnings for each with
instructions on what to use instead.

Some Atom properties, such as number, pos, serials, have themselves been
replaced in the new scheme (index, position, id, respectively).
The centroid method has also been removed.

The methods AtomGroup.translate and AtomGroup.rotateby no longer
take tuples of AtomGroup objects as optional arguments, but only take
explicit vectors. Because Residue, ResidueGroup, Segment, and
SegmentGroup are no longer AtomGroups themselves, leaving this in
creates a minor convenience at the cost of method complexity. The
documentation is correspondingly complex, and in the course of updating
the docstrings to numpy-style for #363, simplifying the method also
simplified the documenation.

Also deprecated as_Universe function. This has yet to be discussed on
the issue-363 branch, but discussion should happen on what its purpose
is.

Also addressed in this commit:

  • In Move from Atom based topology to array based #363, __getattr__ behavior for the Groups
    have changed to a slightly different scheme. No warnings have been added
    for this change to the __getattr__ methods yet. The new scheme goes
    as::

    SegmentGroups getattr can select by segment name
    Segment getattr can select by residue name
    ResidueGroup getattr can select by residue name
    Residue getattr can select by atom name
    AtomGroup getattr can select by atom name

Should only require the following in AtomGroup.__getattr__, or
pehaps override in ResidueGroup and add after a super:

if isinstance(self, ResidueGroup):
    warnings.warn("In version 0.15.0 this will select "
            "residue names, not atom names ", DeprecationWarning)
  • Because Residue, ResidueGroup, Segment, and
    SegmentGroup are no longer AtomGroups themselves, there are methods
    such as Segment.names that now longer exist (one would use
    Segment.atoms.names instead). Something like:
if isinstance(self, Segment):
    warnings.warn("In version 0.15.0, use `segment.atoms.names "
                  "instead.", DeprecationWarning)

will be needed. And since calling SegmentGroup.names no longer gives a single numpy array of all atom names in each segment shoved together, but instead a list of arrays, one for each segment, this will need to be noted. A decorator will be best.

  • I left bonds things alone as it's not my area of expertise. Any deprecations on that front should be placed by @richardjgowers.

The new topology system for #363 will arrive in 0.15.0, and this removes
redundant getters and setters of various `Atom`, `Residue`, and
`Segment` properties. These methods predated the properties that replace
them, and this commit features deprecation warnings for each with
instructions on what to use instead.

Some `Atom` properties, such as `number`, `pos`, `serials`, have themselves been
replaced in the new scheme (`index`, `position`, `id`, respectively).
The centroid method has also been removed.

The methods `AtomGroup.translate` and `AtomGroup.rotateby` no longer
take tuples of `AtomGroup` objects as optional arguments, but only take
explicit vectors. Because `Residue`, `ResidueGroup`, `Segment`, and
`SegmentGroup` are no longer `AtomGroup`s themselves, leaving this in
creates a minor convenience at the cost of method complexity. The
documentation is correspondingly complex, and in the course of updating
the docstrings to numpy-style for #363, simplifying the method also
simplified the documenation.

Also deprecated `as_Universe` function. This has yet to be discussed on
the `issue-363` branch, but discussion should happen on what its purpose
is.

Not addressed in this commit:

1) In #363, `__getattr__` behavior for the Groups
have changed to a slightly different scheme. No warnings have been added
for this change to the `__getattr__` methods yet. The new scheme goes
as::

    SegmentGroups __getattr__ can select by segment name
    Segment __getattr__ can select by residue name
    ResidueGroup __getattr__ can select by residue name
    Residue __getattr__ can select by atom name
    AtomGroup __getattr__ can select by atom name

Should only require the following in `AtomGroup.__getattr__`, or
pehaps override in `ResidueGroup` and add after a `super`:

```python
if isinstance(self, ResidueGroup):
    warnings.warn("In version 0.15.0 this will select "
            "residue names, not atom names ", DeprecationWarning)
```

2) I left bonds things alone as it's not my area of expertise. Any
deprecations on that front should be placed by @richardjgowers.

3) Because `Residue`, `ResidueGroup`, `Segment`, and
`SegmentGroup` are no longer `AtomGroup`s themselves, there are methods
such as `Segment.names` that now longer exist (one would use
`Segment.atoms.names` instead). Something like:

```python
if isinstance(self, Segment):
    warnings.warn("In version 0.15.0, use `segment.atoms.names "
                  "instead.", DeprecationWarning)
```

will be needed.
The methods `phi_selection`, `psi_selection`, `omega_selection`, and
`chi1_selection` rely on particular atom names and the contiguity of
resids to work properly. Although these are not bad assumptions for most
topologies with a protein, they are inherently fragile. They have not
been included in the new topology system addressing #363.
@orbeckst
Copy link
Member

orbeckst commented Feb 8, 2016

Quick comment: I would want to keep the "slot in groups of atoms for translate and rotate" because I don't want to needlessly break old functionality.
Related note: all the new XGroups should still mostly behave like old AtomGroup (e.g. centroid()).

@orbeckst orbeckst mentioned this pull request Feb 8, 2016
2 tasks
@dotsdl
Copy link
Member Author

dotsdl commented Feb 8, 2016

@orbeckst I will reinstate the conveniences built into these transformation methods.

@@ -3336,6 +3378,7 @@ def __init__(self, name, id, atoms, resnum=None):
##if not Residue._cache.has_key(name):
## Residue._cache[name] = dict([(a.name, i) for i, a in enumerate(self._atoms)])

@deprecate(message=_FIFTEEN_DEPRECATION)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can these not get moved across?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@richardjgowers

I think these things are really fragile. They depend on particular atom names (however customary), and the implementation also seems a bit frail, depending on residue id being in order for the chain. I'd rather mark them as deprecated and then reverse the decision later if we think they can be included with confidence.

They've been included for now among the Atomnames TopologyAttr in the 363 branch (dbe6d06), but they don't work at the moment since they depend on Residue __getitem__ working with atom names in a funny way. This isn't implemented yet, and I'm not sure it should be, but it is possible. See __getattr__ for Residues for an example of how.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Fragile" is a very relative term. They work for pretty much anyone who cares about them, i.e. if you have protein simulations then there's a very good chance that this works because the naming is de-facto standard for the backbone atoms (termini is more varied but not as varied between the force fields and PDB that it wouldn't make sense to catch them). It's easy to take the developer's high-road and only provide functionality that "never ever will break" (haha) but that misses the point that most users are quite happy to get the work done in 90% of cases and would simply appreciate a good error message in the remaining 10%.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... so, I don't want to deprecate them.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah rather leave a note in the doc string so people can look into it if it fails.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, I'll remove these from deprecation. @richardjgowers we'll probably need to redo the implementation of these under the new scheme.

@dotsdl
Copy link
Member Author

dotsdl commented Feb 18, 2016

Retargeted to 0.14.1. Between now and BPS I have zero time to finish this up, and I think we need a bit more progress on issue-363 given discussion in #599 to make it clear exactly what is changing as much in advance as possible.

@khuston
Copy link
Contributor

khuston commented Mar 16, 2016

I found a potential use for as_Universe, though it's not very elegant. If the user wants to manually add a topology attribute, they can

u.add_TopologyAttr(mda.core.topologyattrs.Masses(np.ones(len(u.atoms))))

And then u.atoms.masses works, but u.atoms[0].mass results in AttributeError because the masses are not propagated from TopologyGroup to TopologyObjects. This is fixed if instead the user does

u.add_TopologyAttr(mda.core.topologyattrs.Masses(np.ones(len(u.atoms))))
u = mda.core.universe.as_Universe(u)

Then both u.atoms.masses and u.atoms[0].mass work. This feels a bit hackish, and perhaps something like as_Universe should be performed after the user-accessible add_TopologyAttr is called.

Correct me if there is a better way of manually adding an attribute to the topology that gets propagated down to the individual atoms.

@richardjgowers
Copy link
Member

@khuston hmm, adding the attribute should propogate it without your hack... I'll have a poke around

@khuston
Copy link
Contributor

khuston commented Mar 16, 2016

I'm sorry -- I'm mistaken, and the as_Universe is unnecessary. I must have been using the wrong version when I tried that. @richardjgowers

@kain88-de
Copy link
Member

@dotsdl what is still needed to get the deprecations merged?

@dotsdl dotsdl mentioned this pull request Mar 21, 2016
@dotsdl
Copy link
Member Author

dotsdl commented Mar 21, 2016

@kain88-de I'll be addressing the comments made here and elsewhere (#599) and have this PR ready for merge by the end of the week. I've got this whole week for development, and this is among the things at the top of my list.

Also added warning that getattr on ResidueGroups will work only for
residue names, not atom names.
@dotsdl
Copy link
Member Author

dotsdl commented Mar 28, 2016

Added more deprecations; see top comment for updates. Ready for review again and hopeful for merge.

@@ -457,8 +457,84 @@
# And the return route
_SINGULAR_PROPERTIES = {v: k for k, v in _PLURAL_PROPERTIES.items()}

_FIFTEEN_DEPRECATION = "This will be removed in version 0.15.0"
_FIFTEEN_DEPRECATION = "This will be removed in version 0.16.0"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you run a global replace to _SIXTEEN_DEPRECATION? In case you are using emacs and projectile the short cut it C-p r

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or just in this file since all changes are in one file.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'll make the change.

self.name = x

@property
@deprecate(message="{}; use `segid` property instead".format(_SIXTEEN_DEPRECATION))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is git messing up or are you really adding deprecated features here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think git's messing up. I only added the segid property, which for the moment refers to segment name internally.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since id also uses name (see left of screen), it would be really bad if name wasn't already there.

@richardjgowers
Copy link
Member

So bonds won't work any differently in the future, apart from the fact they're now only accessible from Atom/AtomGroup. I've added deprecations for this

@dotsdl
Copy link
Member Author

dotsdl commented Mar 29, 2016

Okay, so I hope nobody uses Atom.id as they currently exist because they apparently get assigned by residues. There is no corresponding mechanism in the new system (there shouldn't be; it's very unnecessary and just asking for staleness issues).

I'm shoring up the suggestion to use AtomGroup.ids instead of AtomGroup.serials in the deprecation warning by giving the serials back if they exist, but otherwise giving back AtomGroup.indices. I'm also removing the setting of Atom.id by residues as they are built because we'd like this to work the same for individual Atoms as it does for AtomGroup.

dotsdl and others added 3 commits March 29, 2016 15:06
In the new topology scheme, atomid is just a label present defined by
the topology format (gro and PDB files have them) that don't need to
start from 0 or be unique. Atoms are uniquely identified by atom *index*
instead, which is mostly internal to the topology system but is unique
between atoms and starts from 0.

Given this, we make atom id in the old system roughly behave as it will
for the new scheme. Strangely, atom ids up until this point were defined
by residues. I have no idea why, or what a good reason might have been
for this.
@dotsdl
Copy link
Member Author

dotsdl commented Mar 30, 2016

Alright, we should be set, then. Ready to merge if there is nothing further.

@kain88-de
Copy link
Member

Looks good to me. I've you add it to the CHANGELOG I think we are good to go.

Since all deprecations already have a replacement I think @tylerjereddy could make changes for a ten2sixteen script that people can already use to be ahead of the curve.

@dotsdl
Copy link
Member Author

dotsdl commented Apr 1, 2016

Hold up...I missed some things. For one thing, Residue doesn't have resid or resname, but instead using id and name. sigh

Adding properties for these things, and the appropriate deprecation for the alternative that should be used.

… respectively

Since Residues are an AtomGroup subclass here, a Residue has e.g. `resids`,
which gives per-atom resids (all the same, *hopefully*). This doesn't
exist in #363, so added proper deprecations. Ditto for segment
properties that spit out per-residue results.

Also, since we don't want our own recommendations for setters using the
deprecated setters internally, did simple copy-paste of setter code to
appropriate property.

Also deprecated Residue.id, Residue.name, and added Residue.resid and
Residue.resname as the proper alternatives.

Still getting deprecation warnings on loading a PSF+DCD Universe. Next
up is checking all the readers for internal use of deprecated
properties/methods.
@dotsdl
Copy link
Member Author

dotsdl commented Apr 1, 2016

So, did a lot in the last commit (b399e18). See log for details.

What needs doing: we get deprecation warnings on load of a Universe with a PSF + DCD. This is probably because the readers for these might be calling deprecated components of MDAnalysis.core.AtomGroup.py. Could someone that's not me try loading up a bunch of different Universes and changing the problem spots to using our "recommended" accessors/getters?

tldr: our own code shouldn't throw deprecation warnings. Only user code should.

@richardjgowers richardjgowers self-assigned this Apr 2, 2016
@richardjgowers
Copy link
Member

Ok I'm reviewing this

@tylerjereddy
Copy link
Member

So, we want a separate ten2sixteen or we want to update the old one to have this new name instead of ten2eleven? Are the changes all summarized nicely in a single place? I guess a lot of them are mentioned or can be deduced from this PR, though end users would likely want a summary somewhere anyway.

@richardjgowers
Copy link
Member

@dotsdl loading PSF/DCD universe doesn't throw warnings now, except.....

The anchoring stuff. I read through it all, and there's no reason why it won't work in the new scheme. I'm not sure how useful it is to pickle AtomGroups? but it should still work as best I can tell. Maybe remove these deprecations and raise an Issue if we want to axe this feature?

@dotsdl
Copy link
Member Author

dotsdl commented Apr 2, 2016

@richardjgowers I don't think it makes any sense to pickle AtomGroups in the new scheme, since they are useless without a Topology, and therefore really a Universe. I think the idea of pickling AtomGroups should be abandoned in favor of making Universe pickle just fine so it can be used directly as arguments for functions using multiprocessing. Of course this means that we must be clear what pickling a Universe means (probably fully serialized Topology (#643) plus paths to trajectory(s)).

But that's another discussion. Let's de-deprecate the anchoring machinery so it stops throwing warnings from internal use. I doubt users are using this stuff directly, though I could be surprised.

We should put a deprecation warning on AtomGroup's __getstate__ and __setstate__ methods if we want to discourage pickling them.

@richardjgowers
Copy link
Member

@tylerjereddy we started making a list here but it needs expanding a lot. I'll try and fill it out today

The anchor mechanisms of a Universe deserve further discussion. They are
used for AtomGroup pickling/unpickling, which may or may not make sense
under the new system. De-deprecating for now, though.
Also, moved warning decorator outside of cached decorator if present so
it displays every time.

We should now be good to go here.
@dotsdl
Copy link
Member Author

dotsdl commented Apr 4, 2016

Okay, after some last fixes on some omissions, I think we're finally ready. CHANGELOG updated, too. These deprecation warnings should help us out when adapting the existing tests to the new API conventions, since currently the test suite throws them everywhere.

@orbeckst
Copy link
Member

orbeckst commented Apr 5, 2016

On 2 Apr, 2016, at 09:46, Tyler Reddy wrote:

So, we want a separate ten2sixteen or we want to update the old one to have this new name instead of ten2eleven?

Assuming that there's no harm in running "ten2sixteen" on >0.11 code, I'd say make a new ten2sixteen and remove ten2eleven ... we just mustn't forget to update wiki etc.

@dotsdl
Copy link
Member Author

dotsdl commented Apr 7, 2016

Anything else to address here? Otherwise can we merge?

@richardjgowers richardjgowers merged commit d73c18a into develop Apr 10, 2016
@kain88-de
Copy link
Member

This results in a lot of warning messages now about deprecations. It should be easy enough to grep for the words and replace them but I lost track what they all were. @dotsdl can you make a list of the deprecated methods and their preferred alternative, then I can change them across the whole codebase (emacs has a function for that 😉)

@kain88-de
Copy link
Member

Plus I get deprecation warning when I use the preferred methods ag.positions triggers a warning that get_positions is deprecated.

@richardjgowers
Copy link
Member

@kain88-de I'm working on this a little. positions is because it goes through get_positions, so it needs to get redone so that get_positions goes via positions. Ideally we should be able to run the tests with no deprecation warnings

@@ -1143,6 +1234,7 @@ def n_segments(self):
return len(self.segments)

@property
@warn_atom_property
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

which one should be used instead in the future?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

*Group.atoms.indices

Indices always refer to atom indices (zero-based, and always unique among atoms), no matter what level called from. In order to get back indices as they currently work from all levels (except AtomGroup) in the new system, one must prefix with .atoms.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants